To Do: News-scrapper

Priority Project #
19 News-scrapper #: 6
19.01

News Lists

  • The issue is that the buttons trigger a page refresh and are slow.
  • Can this be overcome with a JavaScript flag and a "press to do all" button at the end?

Users

  • Centrally scrape the content of the articles. Check the user's login once, at their login stage, and track whether it is good. Then show on the title page whether the login is confirmed as valid
Pending
19.01

LinkedIn:

  • Popup that checks that the LinkedIn login and password are successful
  • The first result should return the number of connections and estimate the time to download them, before proceeding.
    • Advise the user how long it will take to download and that a file will be emailed to them 
  • Email CSV file - one step directly after the scrape - ie save a file in the database at the end of the scrape and email it (ie merge the 3 buttons we have)
Complete
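A minimal sketch of the time estimate mentioned above, in Python for illustration only. The function names and the 3-seconds-per-contact rate are assumptions, not measured values; the real rate would need to come from timing an actual scrape.

```python
import math


def estimate_download_minutes(connection_count: int, seconds_per_contact: float = 3.0) -> int:
    """Return the estimated download time in whole minutes, rounded up.

    seconds_per_contact is an assumed average, not a measured figure.
    """
    if connection_count < 0:
        raise ValueError("connection_count must be non-negative")
    return math.ceil(connection_count * seconds_per_contact / 60)


def advice_message(connection_count: int, seconds_per_contact: float = 3.0) -> str:
    """Build the text shown to the user before the scrape proceeds."""
    minutes = estimate_download_minutes(connection_count, seconds_per_contact)
    return (
        f"Found {connection_count} connections. "
        f"The download should take about {minutes} minutes; "
        "a CSV file will be emailed to you when it finishes."
    )
```

The message is built before the scrape starts, so the user can decide whether to proceed.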
19.02

Update scrapers which have broken over time

Working / Not working
  • CNBC
  • Teletrader
  • FT Markets
  • Economist
  • FIA
  • Matt Levine
  • ISDA
  • BIS
  • Clarus
  • Economist Espresso
  • FirstFT, FT Main
  • Reuters
  • Economist Special Reports
  • Risk
  • EACH
  • FN
  • Global Custodian
  • Securities Finance
- Complete
19.02

Set up a Payment gateway- Stripe? 

Pending
19.03

Favicon

  •  Not working, both in bookmark and in tab, but only when on the home page… Odd
Complete
19.03

User Password checks

  • Need to develop an algorithm to check the passwords upon login
Pending
19.04

Login

  • Add ability to register
- Complete
19.04

Deleting LinkedIn Users

  • "Delete my LinkedIn contacts" should not delete them but mark them as hidden
  • How to handle when we have a contact shared with multiple "owners"?
Pending
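One possible way to handle the shared-contact question, sketched in Python for illustration: keep the hidden flag on the owner-to-contact link rather than on the contact itself, so hiding for one owner leaves the other owners untouched. `ContactLink` and `ContactBook` are hypothetical names, not existing entities.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ContactLink:
    """One owner's view of one contact; the flag lives here, not on the contact."""
    owner_id: int
    contact_id: int
    hidden: bool = False


@dataclass
class ContactBook:
    links: list[ContactLink] = field(default_factory=list)

    def add(self, owner_id: int, contact_id: int) -> None:
        self.links.append(ContactLink(owner_id, contact_id))

    def hide_all_for_owner(self, owner_id: int) -> int:
        """Soft delete: mark this owner's links as hidden and return how many changed."""
        changed = 0
        for link in self.links:
            if link.owner_id == owner_id and not link.hidden:
                link.hidden = True
                changed += 1
        return changed

    def visible_contacts(self, owner_id: int) -> list[int]:
        """Contact ids this owner can still see."""
        return [l.contact_id for l in self.links if l.owner_id == owner_id and not l.hidden]
```

With this layout nothing is physically removed, so a hidden contact can be restored by clearing the flag.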
19.05

User-specific summary email

  • Once the broken scrapers are fixed, the main ‘ask’ is to make it “user-specific” (ie each user can define their own list of stories and categories to include in the summary email). So the following need to be linked to the logged user:
    • Headline categories (eg Macro, CV19, Corp, Cryp, Pol, Ukraine, Clearing, x)
    • What to include/exclude for the email
    • What economic stats are key to them
    • Their own list of favourites and commentaries
  • At the moment each of these fields is attached to the news article, so we need to create a User-News relationship to achieve this. Does that make sense to you?
- Complete
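A rough sketch of the User-News relationship described above, in Python for illustration. The `UserNews` record and its field names are assumptions; the idea is only that the per-user fields move off the article and onto a row keyed by user and article.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class UserNews:
    """Per-user state for one article (hypothetical field names)."""
    user_id: int
    article_id: int
    category: str = ""            # eg Macro, Corp, Clearing
    include_in_email: bool = False
    favourite: bool = False
    commentary: str = ""


def email_article_ids(rows: list[UserNews], user_id: int) -> list[int]:
    """Article ids that this user has chosen for their summary email."""
    return [r.article_id for r in rows if r.user_id == user_id and r.include_in_email]
```

Two users can then categorise or include the same article differently without affecting each other.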
19.05

LinkedIn export

  • Notes need to include the Employment and Education details
  • Searchable text for mission statement
  • Export function seems to be failing
Pending
19.06

Removed the CMS and Settings entities

Why isn't LinkedIn password appearing on the list in a User's profile?

- Complete
19.06

Scrape content

  • Check that the scraping of all the news sites works
  • Also ensure that the content, not just the headline, is scraped and saved in the ‘fullContent’ field in the news entity.
  • In the view, hovering over the title will display the full content
  • For websites where the content is behind a paywall, log in and scrape where available.
Pending
19.07
  • Review of security (Dashboard -changed to ROLE_USER)
- Complete
19.08

Settings

  • Add VCard code (Jeroen)
- Complete
19.09

Market Data

  • Future T+2 data….  
- Complete
19.10

Read count

  • The new count needs to be user specific. 
  • So this is the number of unread articles in the past 24 hours by user.   Service by user and source
  • The read all button should not affect archives
    • Perhaps in archive you can have unread shown separately. 
Complete
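A minimal sketch of the per-user unread count, by source, over the past 24 hours, in Python for illustration. The dictionary keys (`id`, `source`, `published`) and the function name are assumptions about how articles might be represented.

```python
from collections import Counter
from datetime import datetime, timedelta


def unread_counts(articles, read_ids, now, window=timedelta(hours=24)):
    """Count one user's unread articles per source within the window.

    articles: iterable of dicts with 'id', 'source' and 'published' (datetime).
    read_ids: set of article ids this user has already read.
    Articles older than the window are ignored, so archives are not counted.
    """
    cutoff = now - window
    counts = Counter()
    for article in articles:
        if article["published"] >= cutoff and article["id"] not in read_ids:
            counts[article["source"]] += 1
    return dict(counts)
```

Because `read_ids` is passed per user, the same article list gives a different count for each user.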
19.11

Economic stats

  • Highlight and suppress buttons not working
  • Button to use the Teletrader defaults
  • Button to remove your highlight or take standard
  • Or if blank use the default - over-ride
Complete
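A small sketch of the "blank means use the default" rule above, in Python for illustration. The level names follow the Highlight/Standard/Low buttons; the function names and the dictionary layout are assumptions.

```python
DEFAULT_LEVEL = "Standard"  # assumed fallback when no default is defined either


def effective_level(stat, user_overrides, defaults):
    """Return the user's own level for a stat, or the default if theirs is blank."""
    override = user_overrides.get(stat)
    if override:  # None or "" counts as blank
        return override
    return defaults.get(stat, DEFAULT_LEVEL)


def remove_override(stat, user_overrides):
    """Drop the user's own level so the default applies again."""
    user_overrides.pop(stat, None)
```

Here `defaults` stands in for the Teletrader defaults and `user_overrides` for one user's own choices.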
19.12

Subscription page/Memberships

  • Check whether a user has a membership upon opening the Subscription page
    • If not, create a New membership, initially a Free membership
    • The new Membership button shouldn't open the form, but just save the details as per the button
      • Include Today's date 
      • The new membership button shows all users to Non-Admin users
Complete
19.13

Pricing

The functions of the website are

  • LinkedIn scrape
  • News:
    • Single place to read articles
      • Summary access only
      • Hover for full article
    • Ability to mark articles as read, to avoid re-reading
    • Ability to select key articles and send summary email
    • Mark economic stats as favourite - to generate an email
    • See what others are liking - are you missing an important well read article?

 

  • Summary Read-only  
    • Full articles available via a link 
  • One-stop Read access
    • However
      •  
- Complete
19.14

CompanyDetails

  • The HideOther inputs not coming through into Live
- Complete
19.15

Bugs

  • subscriptions_buttons line 45
    • Make this dynamic?
  • source\index line 43
    • Make this dynamic?
  • Favicon issue on live (when favicon file exists)
- Complete
19.17

User

Add time zones

Complete
19.17

Formatting

  • Login button
  • Subscription page to show when not logged in
  • Ability to look at each headline
  • User Edit page -
    • Show categories
    • Button to delete photo
    • Show photo
    • Hide Status from user
  • New stories
    • No like/read buttons
- Complete
19.18

Bugs

  • Merge the ‘Highlight/Standard/Low’ buttons into a pop-up.  To avoid confusion. 
  • Set User bug in Market Stats
Complete
19.19

 New sites to scrape

Complete
19.20

LinkedIn contact export

  • Develop a view of the contacts that can be expanded/contracted [like Excel]
  • Import photo and save file
  • Export file - two types: Full CSV and Outlook
    • Export to directory per user
    • Develop a concatenated Notes for the Outlook export
  • Control the number of exports (2 now)
  • Don't show the flashing screen.  Warn on timings
  • Automatically email file
- Complete
19.21

LinkedIn scrape (https://www.linkedin.com/in/stephen-j-nurse/)

  • Include the summary experience and the employment history
Complete
19.22

LinkedIn Contacts

  • Deleting LinkedIn Contacts - just the owner's contact and the languages spoken
Complete
19.23

Market stats

  • Scrape doesn't work in live
- Complete
19.24

LinkedIn contacts

  • Create a User-Settings upon scrape
- Complete
19.25

User Passwords

  • List the users in alphabetical order
  • When you edit a User Password it changes the User name
Complete
19.26

Bugs

  • Market Economic Stats scraper doesn't work in live.  Not just United States (filters).  Aman is re-writing the scrape 
  • Cron job for MarketStats doesn't work
- Complete
19.27

Users

  • Delete doesn't work
  • Include in user view.
    • Membership
- Complete
19.28

Other

  • Make the Economic Stats scrape time 1am and 8.30am
  • Cron does not include FirstFT
  • Add a button to test that the FT.com login is working
  • Full content button (FT) only shows in first block (unassigned)
  • Upon user login, run a check on the user's logins and passwords and determine access accordingly
- Complete
19.29

Security

  • Add role hierarchy in security.yaml
- Complete
19.30

Memberships

  • Create a 30-day free trial period
    • Unable to extend
    • Link the system's login to the LinkedIn login and other site logins to avoid gaming
    • Buttons to upgrade
- Complete
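A minimal sketch of the trial rules above, in Python for illustration. The function names are hypothetical, and tracking previously used LinkedIn logins is only one assumed way of linking logins to avoid gaming.

```python
from datetime import date, timedelta

TRIAL_DAYS = 30


def trial_end(start: date) -> date:
    """First day on which the trial is no longer valid."""
    return start + timedelta(days=TRIAL_DAYS)


def trial_active(start: date, today: date) -> bool:
    """True while today falls inside the 30-day window; there is no extension."""
    return start <= today < trial_end(start)


def can_start_trial(linkedin_login: str, used_logins: set) -> bool:
    """Refuse a new trial if this LinkedIn login has already been used for one."""
    return linkedin_login.strip().lower() not in used_logins
```

`used_logins` is expected to hold lower-cased logins, so the same address with different capitalisation is treated as one user.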
19.31

LinkedIn Contacts

  • Don't save a Language Spoken for the null case
Complete
19.33

User memberships

  • Dates
Complete
19.33

Economic Market Statistics

  • Cron job.  Refresh every 15mins
    • Button to refresh manually if >10mins 
  • Historical view by stat
- Complete
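A small sketch of the manual-refresh rule above, in Python for illustration. The function name is hypothetical; the schedule in the comment is standard cron syntax for "every 15 minutes".

```python
from datetime import datetime, timedelta

# Scheduled refresh, as a standard cron expression: */15 * * * *

MIN_AGE = timedelta(minutes=10)


def manual_refresh_allowed(last_refresh, now, min_age=MIN_AGE):
    """Show the manual refresh button only if the data is older than min_age.

    last_refresh is None when the stats have never been refreshed.
    """
    return last_refresh is None or (now - last_refresh) > min_age
```

Keeping the threshold in one constant makes it easy to change if 10 minutes proves too short or too long.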
20.44

Favicon

- Complete