This system is built using N8N, a workflow automation tool, and is composed of four main steps: initialisation, scheduled scraping of source posts, scheduled parasite flow, and scheduled LinkedIn poster. The data management primarily relies on Google Sheets, which are referred to as ‘databases’.
Step 1: Initialisation – Finding High-Quality Creators
The first stage involves identifying a list of top creators in your niche whose posts consistently perform well.
1. Set up N8N and Google Sheet:
- Ensure your N8N instance is ready.
- Prepare a Google Sheet to store creator information. This sheet will serve as your first ‘database’ and should have columns like ‘LinkedIn URL’, ‘Name’, ‘Headline’, ‘Posted At’, and ‘Likes on Post’.
2. Scrape LinkedIn Posts using Apify:
- Objective: Automatically scrape LinkedIn posts based on a search term to find popular content and identify viral creators.
- Tool: We suggest using Apify due to LinkedIn’s strong anti-scraping protections.
- Process:
- Go to Apify’s marketplace and search for a LinkedIn post scraper.
- Test the scraper manually in Apify’s dashboard first to ensure it yields results (e.g., searching for “coding”).
- Integrate Apify with N8N via HTTP Request node: Find Apify’s API documentation. Obtain your Apify API token from the integrations page and use it in the authorisation header of your N8N HTTP Request node as a Bearer token.
- Identify the Actor ID for the chosen Apify scraper. This is typically found in the URL of the Apify console when you’re viewing the actor (e.g., console.apify.com/actors/<ACTOR_ID>/input).
- Configure the N8N HTTP Request node: Copy the cURL command from Apify’s API docs and paste it into N8N’s ‘Import cURL’ feature. This will auto-map the URL, method (POST), and authentication headers.
- Add the JSON body with your search query (e.g., {"keyword": "coding"}), and set the maximum number of results you want to scrape (e.g., 200 posts).
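As a sketch, the HTTP Request node ends up representing a call like the one below. The actor ID, token placeholder, and input field names (`keyword`, `maxResults`) are assumptions — check your chosen scraper’s input schema in the Apify console:

```python
import json

APIFY_TOKEN = "<YOUR_APIFY_TOKEN>"  # from Apify's integrations page
ACTOR_ID = "<ACTOR_ID>"             # from the actor's console URL

def build_apify_run_request(keyword, max_results=200):
    """Assemble the URL, headers, and body the n8n HTTP Request node sends."""
    return {
        # Apify v2 endpoint that runs the actor and returns its dataset items.
        "url": f"https://api.apify.com/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items",
        "method": "POST",
        "headers": {
            "Authorization": f"Bearer {APIFY_TOKEN}",
            "Content-Type": "application/json",
        },
        # Field names depend on the actor's input schema -- these are examples.
        "body": json.dumps({"keyword": keyword, "maxResults": max_results}),
    }

request = build_apify_run_request("coding", 200)
```

This mirrors what the cURL import sets up for you: a POST with a Bearer token header and a JSON body matching the actor’s input.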
3. Filter and Store High-Quality Creators:
- After scraping, you’ll have a list of posts with engagement data (likes, comments, shares).
- Use a ‘Filter’ node in N8N to narrow down the posts. For instance, filter for posts with more than 100 likes (e.g., the expression $json.engagement.likes > 100).
- From the filtered posts, extract the author’s LinkedIn URL, name, headline, and engagement metrics.
- Use a ‘Google Sheet’ node (Append Row) to add these high-quality creator details to your ‘creators’ Google Sheet. This ensures you’re populating your database with people who consistently go viral in your niche.
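The filter-and-extract logic above can be sketched in plain Python. The field names (`authorProfileUrl`, `engagement.likes`, etc.) are assumptions — the actual keys depend on the scraper’s output:

```python
def filter_viral_creators(posts, min_likes=100):
    """Keep only authors whose post cleared the likes threshold,
    shaped into rows matching the 'creators' sheet columns."""
    creators = []
    for post in posts:
        if post.get("engagement", {}).get("likes", 0) > min_likes:
            creators.append({
                "LinkedIn URL": post["authorProfileUrl"],
                "Name": post["authorName"],
                "Headline": post["authorHeadline"],
                "Posted At": post["postedAt"],
                "Likes on Post": post["engagement"]["likes"],
            })
    return creators
```

Each returned dict corresponds to one Append Row operation in the Google Sheets node.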
Step 2: Scheduled Scraping of Source Posts
This step focuses on regularly collecting the most recent and viral posts from the high-quality creators identified in Step 1.
1. Set up Schedule Trigger:
Add a ‘Schedule Trigger’ node in N8N, setting it to run at your desired cadence (e.g., once a week or once a day). This will initiate the flow for scraping new source posts.
2. Read Creator Data:
Add a ‘Google Sheets’ node (Get All Rows) to read all the LinkedIn URLs of the creators from your ‘creators’ Google Sheet.
3. Scrape Recent Posts from Creators:
Objective: For each creator, scrape their most recent high-quality posts. The Apify actor can filter posts by age (e.g., last 24 hours or last 7 days), which helps avoid duplicates.
Tool: Apify’s ‘Profile Post Bulk Scraper’ (or similar).
Process: Aggregate the LinkedIn URLs from the ‘Get All Rows’ node into a single array using an ‘Aggregate’ node, mapping only the ‘LinkedIn URL’ field.
Use an HTTP Request node to call the Apify API for the ‘Profile Post Bulk Scraper’. You’ll need to update the Actor ID to this new scraper’s ID and provide the aggregated LinkedIn URLs in the JSON body under a key like targetUrls.
Ensure the JSON body correctly formats the targetUrls as an array of strings. You might need to use JSON.stringify to convert the array to a proper JSON string if N8N doesn’t handle it automatically.
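A minimal sketch of building that body, assuming the sheet rows carry a ‘LinkedIn URL’ column and the actor accepts a `targetUrls` array (the `postedLimit` key is a hypothetical example of an age filter — check the actor’s input schema):

```python
import json

def build_bulk_scrape_body(rows):
    """Aggregate sheet rows into the JSON body for the bulk post scraper.
    json.dumps plays the role of JSON.stringify in the n8n expression."""
    urls = [row["LinkedIn URL"] for row in rows]
    return json.dumps({
        "targetUrls": urls,       # must be a proper JSON array of strings
        "postedLimit": "24h",     # hypothetical age filter to avoid duplicates
    })
```

Serialising the whole body with `json.dumps` guarantees the array is emitted as valid JSON rather than a Python-style repr.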
4. Store Source Post Data:
Add a ‘Google Sheet’ node (Append Row) to store the scraped post data. This will be your second ‘database’, called ‘source posts’.
Map the relevant fields from the scraped data to your sheet’s columns. Recommended fields include: ‘ID’, ‘Post URL’, ‘Content’, ‘Author LinkedIn Profile URL’, ‘Posted At’, and up to three ‘Post Image URLs’ (e.g., postImage1URL, postImage2URL, postImage3URL).
Minimise API Calls: In the Google Sheet node, enable ‘Minimize API Calls’ to reduce the number of requests, as large data dumps can sometimes cause issues with Google Sheets integration.
Step 3: Scheduled Parasite Flow – AI Content Generation
This is where the magic happens: transforming the scraped content into unique, high-quality posts using AI.
1. Set up Schedule Trigger:
Add another ‘Schedule Trigger’ node, set to run once per day (or desired frequency).
2. Read Source Posts:
Add a ‘Google Sheets’ node (Get All Rows) to read all posts from your ‘source posts’ sheet.
Implement a ‘Loop over items split in batches’ node, setting the batch size to 1. This is crucial for managing AI rate limits and ensuring each post is processed individually. All subsequent AI processing steps will occur within this loop.
3. AI Research and Content Generation (within the loop):
A – Find Web Data (OpenAI – GPT-4 Search Preview):
Objective: Find additional relevant information online to enrich the content.
Tool: OpenAI’s GPT-4 search preview model.
Prompt: Act as an intelligent research assistant. Take the social media post content as input and find three sources discussing similar topics. Return brief, detailed summaries of each source in a JSON format (e.g., {"sourceOneSummary": "...", "sourceTwoSummary": "...", "sourceThreeSummary": "..."}).
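Downstream nodes need to read those summaries back out of the model’s reply. A tolerant parser (a sketch — key names follow the prompt above) avoids breaking the loop when a key is missing:

```python
import json

def parse_research(raw):
    """Extract the three source summaries from the model's JSON reply,
    skipping any key the model failed to return."""
    data = json.loads(raw)
    keys = ("sourceOneSummary", "sourceTwoSummary", "sourceThreeSummary")
    return [data[k] for k in keys if k in data]
```

In n8n this corresponds to referencing the parsed fields in later expressions rather than the raw string.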
B – Analyse Image (OpenAI – GPT-4 Latest) (Optional):
Objective: If the source post has an image, generate a comprehensive description of it to be incorporated into the new content.
Process: Add a ‘Switch’ node before the image analysis. Configure it to check if postImage1URL exists (i.e., is not empty).
If an image exists (route 0): Use an ‘OpenAI’ node (Analyze Image operation) with GPT-4 latest model. Provide the postImage1URL and a prompt like “Describe this extremely comprehensively”.
If no image exists (route 1): Directly connect this route to the ‘Generate Unique Outline’ step.
Merge the outputs of both routes after the image analysis (or lack thereof) using a ‘Merge’ node, ensuring the subsequent steps receive the necessary data regardless of whether an image was present.
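The Switch node’s routing rule is simple enough to state as code (a sketch; the field name `postImage1URL` follows the sheet columns defined earlier):

```python
def route_post(post):
    """Mirror the Switch node: route 0 analyses the image, route 1 skips it."""
    if post.get("postImage1URL"):
        return "analyze_image"      # route 0: image present
    return "generate_outline"       # route 1: no image, skip straight ahead
```

The Merge node then rejoins both routes so the outline step always receives exactly one item per post.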
C – Generate Unique Outline (OpenAI – GPT-4.1):
Objective: Create a detailed, comprehensive, and unique outline for the new LinkedIn post, incorporating elements from the web research and potentially the image description.
Tool: OpenAI’s GPT-4.1 (or latest available).
Prompt: Act as a helpful, intelligent writing assistant. Take the original social media post content, the web research summaries, and the optional image description as input. Generate a detailed, unique outline of the content, ensuring it’s not a verbatim copy but an improved, more comprehensive version, or one with its own twist.
Emphasise improving the original, not just copying it. Output the outline in Markdown ATX format, aiming for 5-10 headings.
D – Regenerate New Content (OpenAI – GPT-4.5):
Objective: Write the final high-quality LinkedIn post based on the generated outline, adhering to a specific tone of voice and character limits.
Tool: OpenAI’s GPT-4.5 (or latest, though more expensive).
Prompt: Act as a helpful, intelligent writing assistant. Take the outline as input and write a high-quality LinkedIn post.
Crucially, define your desired tone of voice and writing rules. For example, “Do not be overly engaging,” “be spartan and relatively informal but maintain a professional curious tone”.
Provide examples: To improve quality and ensure the AI matches your tone, include examples of your own LinkedIn posts and their corresponding outlines in the prompt’s context. The creator found reverse-engineering their own posts into outlines and then using them as examples significantly improved the AI’s output.
Output the LinkedIn post in JSON format with a postBody field.
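Since models occasionally return malformed JSON, it’s worth extracting `postBody` defensively before writing it to the sheet (a sketch; the key name follows the prompt above):

```python
import json

def extract_post_body(ai_reply):
    """Pull postBody out of the model's JSON reply; fall back to the
    raw text if the reply isn't valid JSON or the key is missing."""
    try:
        return json.loads(ai_reply)["postBody"]
    except (json.JSONDecodeError, KeyError):
        return ai_reply.strip()
```

The fallback means a single malformed reply degrades gracefully instead of halting the whole loop.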
4. Store Generated Content:
Add a ‘Google Sheet’ node (Append Row) to store the newly generated content. This will be your third ‘database’, called ‘destination posts’.
Map the following fields: ‘ID’ (from original post), ‘Original Post URL’, ‘Generated Content’ (from AI), ‘Original LinkedIn URL’ (of creator), ‘Generated At’ (use {{$now}}), and ‘Post Status’ (set this to ‘Draft’ initially).
5. Loop Back and Wait:
Connect the ‘Append Row’ node back to the ‘Loop over items split in batches’ node. This ensures the flow processes each item sequentially.
Insert a ‘Wait’ node (e.g., 5 seconds) after the Google Sheet append and before looping back. This helps prevent rate limiting issues with Google Sheets or other APIs if processing many items.
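The loop-plus-wait pattern amounts to sequential processing with a pause between items, as sketched below (`handle` stands in for the whole AI pipeline inside the loop):

```python
import time

def process_posts(posts, handle, delay_seconds=5):
    """Process items one at a time with a pause between iterations,
    mirroring the batch-size-1 Loop node followed by a Wait node."""
    results = []
    for post in posts:
        results.append(handle(post))   # the per-item AI + append-row work
        time.sleep(delay_seconds)      # back off to avoid rate limits
    return results
```

Batch size 1 plus the delay keeps each item’s API calls well under typical per-minute rate limits.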
Step 4: Scheduled LinkedIn Poster
The final step is to automatically publish the ‘Draft’ posts to LinkedIn and update their status.
1. Set up Schedule Trigger:
Add another ‘Schedule Trigger’ node, configured to run once a day.
2. Read Draft Posts:
Add a ‘Google Sheets’ node (Get All Rows) to read from your ‘destination posts’ sheet.
Add a ‘Filter’ node to retrieve only posts where ‘Post Status’ is equal to ‘Draft’.
Use a ‘First Matching Row’ operation (or similar logic) after the filter to select only one post to publish per run, to avoid posting too frequently.
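The filter-then-first logic reduces to a few lines (a sketch, assuming the sheet’s ‘Post Status’ column as defined in Step 3):

```python
def first_draft(rows):
    """Return the first row still marked 'Draft', or None if all are published."""
    drafts = [r for r in rows if r.get("Post Status") == "Draft"]
    return drafts[0] if drafts else None
```

Returning at most one row per run is what throttles the system to one LinkedIn post per schedule tick.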
3. Publish to LinkedIn:
Objective: Create a post on LinkedIn using the generated content.
Tool: N8N’s ‘LinkedIn’ node (Create a Post operation).
Process: Connect your LinkedIn account by creating a new LinkedIn credential in N8N. If posting as a person, turn off ‘Organisation support’ and keep ‘Legacy’ on. Then configure the ‘Create a Post’ node: set ‘Resources’ to ‘Post Operations’ and ‘Operation’ to ‘Create Post as Person’.
Map the postBody from your generated content (from the previous steps in the loop) to the ‘Content’ field in the LinkedIn node.
Set ‘Visibility’ (e.g., ‘Connections’ for testing, or ‘Public’ for broader reach).
Test the posting logic manually with a simple “Hello world” post to ensure it works before using generated content.
4. Update Post Status:
Add a ‘Google Sheet’ node (Update Row in Sheet).
Select your ‘destination posts’ sheet.
Identify the row to update using the ‘ID’ of the post that was just published.
Set the ‘Post Status’ column to ‘Published’ for that row. This is crucial to prevent the system from re-posting the same content in subsequent runs.
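The update step is the idempotency guard for the whole system; in code it reduces to (a sketch over the sheet’s ‘ID’ and ‘Post Status’ columns):

```python
def mark_published(rows, post_id):
    """Flip the matching row's status so subsequent runs skip it."""
    for row in rows:
        if row.get("ID") == post_id:
            row["Post Status"] = "Published"
    return rows
```

Without this status flip, the Step 4 filter would keep selecting the same draft and re-post it every run.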
By following these steps, you can build a fully autonomous LinkedIn posting machine that grows your audience with unique, high-quality content derived from popular posts in your niche.
